1.
Cell ; 186(25): 5587-5605.e27, 2023 12 07.
Article in English | MEDLINE | ID: mdl-38029745

ABSTRACT

Defects in heart development are the number one cause of human fetal death. Because the human embryonic heart is inaccessible and the impacts of mutations, drugs, and environmental factors on the specialized functions of different heart compartments are not captured by in vitro models, determining the underlying causes is difficult. Here, we established a human cardioid platform that recapitulates the development of all major embryonic heart compartments, including right and left ventricles, atria, outflow tract, and atrioventricular canal. By leveraging 2D and 3D differentiation, we efficiently generated progenitor subsets with distinct first, anterior, and posterior second heart field identities. This advance enabled the reproducible generation of cardioids with compartment-specific in vivo-like gene expression profiles, morphologies, and functions. We used this platform to unravel the ontogeny of signal and contraction propagation between interacting heart chambers and dissect how mutations, teratogens, and drugs cause compartment-specific defects in the developing human heart.


Subjects
Heart Diseases; Heart Ventricles; Heart; Humans; Transcriptome/genetics; Cell Line; Gene Expression Regulation, Developmental; Heart Diseases/genetics; Heart Diseases/metabolism
2.
Channels (Austin) ; 17(1): 2192360, 2023 12.
Article in English | MEDLINE | ID: mdl-36943941

ABSTRACT

Cav1.4 L-type calcium channels are predominantly expressed at the photoreceptor terminals and in bipolar cells, mediating neurotransmitter release. Mutations in its gene, CACNA1F, can cause congenital stationary night blindness type 2 (CSNB2). Due to phenotypic variability in CSNB2, characterization of pathological variants is necessary to better determine the pathological mechanism at the site of action. A set of known mutations affects conserved gating charges in the S4 voltage sensor, two of which have been found in male CSNB2 patients. Here, we describe two disease-causing Cav1.4 mutations with gating charge neutralization, exchanging arginine 964 with glycine (RG) or arginine 1288 with leucine (RL). In both, charge neutralization was associated with a reduction in channel expression, also reflected in smaller ON gating currents. In RL channels, the strong decrease in whole-cell current densities might additionally be explained by a reduction of single-channel currents. We further identified alterations in their biophysical properties, such as a hyperpolarizing shift of the activation threshold and an increase in the slope factor of activation and inactivation. Molecular dynamics simulations in RL-substituted channels indicated water wires in both resting and active channel states, suggesting the development of omega (ω) currents as a new pathological mechanism in CSNB2. The sum of the respective channel property alterations might add to the differential symptoms in patients besides other factors, such as genomic and environmental deviations.


Subjects
Eye Diseases, Hereditary; Myopia; Night Blindness; Humans; Male; Calcium Channels, L-Type/genetics; Calcium Channels, L-Type/metabolism; Night Blindness/metabolism; Eye Diseases, Hereditary/metabolism; Myopia/metabolism; Calcium/metabolism
3.
PLoS One ; 17(11): e0276607, 2022.
Article in English | MEDLINE | ID: mdl-36350811

ABSTRACT

High-throughput technologies in genomics enable the analysis of small alterations in gene expression levels. Patterns of such deviations are an important starting point for the discovery and verification of new biomarker candidates. Identifying such patterns is a challenging task that requires sophisticated machine learning approaches. Currently, there are a variety of classification models, and a common approach is to compare their performance and select the best one for a given classification problem. Since the association between the features of a data set and the performance of a particular classification method is still not fully understood, the main contribution of this work is to provide a new methodology for predicting the performance of different classifiers in the field of biomarker discovery. We propose here a three-step computational workflow that includes an analysis of the data set characteristics, the calculation of the classification accuracy and, finally, the prediction of the resulting classification error. The experiments were carried out on synthetic and microarray datasets. Using this method, we showed that the predictability strongly depends on the discriminatory ability of the features, e.g., sets of genes, in two- or multi-class datasets. If a dataset has a certain discriminatory ability, this method enables prediction of the classification performance before applying a learning model. Thus, our results contribute to a better understanding of the relationship between dataset characteristics and the corresponding performance of a machine learning method, and suggest the optimal classification method for a given dataset based on its discriminatory ability.


Subjects
Gene Expression Profiling; Genomics; Gene Expression Profiling/methods; Workflow; Biomarkers, Tumor; Machine Learning
4.
Stud Health Technol Inform ; 293: 137-144, 2022 May 16.
Article in English | MEDLINE | ID: mdl-35592973

ABSTRACT

BACKGROUND: Process mining is a promising field of data analytics that is yet to be applied broadly in healthcare. It can streamline the care process, leading to a higher quality of care, increased patient safety and lower costs. OBJECTIVES: To get deeper insights into the emergence and detectability of delirium in a gerontopsychiatric setting. METHODS: We use process mining to create process models from routinely collected, anonymised nursing data from two gerontopsychiatric wards. We analyse these models to get a longitudinal view of the care processes. RESULTS: The process models comprise all activities during patients' stays but are too extensive and challenging to interpret due to the wide variation in care paths. Although the models give insight into frequent paths and activities, they are insufficient to explain the emergence of delirium meaningfully. No apparent difference between stays with or without delirium could be detected. CONCLUSION: Conducting process mining on routinely collected data is easy, but the interpretation of the results was a challenge. We identified four limitations associated with using this data and gave recommendations on adapting it for further analysis.


Subjects
Delirium; Hospitals; Data Mining/methods; Delirium/diagnosis; Delivery of Health Care; Humans; Patient Safety
5.
Stud Health Technol Inform ; 279: 54-61, 2021 May 07.
Article in English | MEDLINE | ID: mdl-33965919

ABSTRACT

Hydrogen breath tests are a well-established method to help diagnose functional intestinal disorders such as carbohydrate malabsorption or small intestinal bacterial overgrowth. In this work we apply unsupervised machine learning techniques to analyze hydrogen breath test datasets. We propose a method that uses 26 internal cluster validation measures to determine a suitable number of clusters. In an induced external validation step we use a predefined categorization proposed by a medical expert. The results indicate that the majority of the considered internal validation indexes were not able to produce a reasonable clustering. Considering a predefined categorization performed by a medical expert, a novel shape-based method obtained the highest external validation measure in terms of the adjusted Rand index. The predefined clusterings constitute the basis of a supervised machine learning step that is part of our ongoing research.


Subjects
Bacterial Infections; Breath Tests; Cluster Analysis; Humans; Hydrogen; Unsupervised Machine Learning
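
The abstract above uses the adjusted Rand index as its external validation measure. As a minimal sketch of how that index is computed from two labelings of the same items, using the standard pair-counting formula (the function name and toy labels are illustrative assumptions, not taken from the paper):

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Contingency counts: items that fall in cluster i of A and cluster j of B.
    cell_counts = Counter(zip(labels_a, labels_b))
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put every item in one cluster.
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


# Toy example: a clustering compared with a hypothetical expert categorization.
expert = [0, 0, 1, 1]
clustering = [1, 1, 0, 0]  # same partition, different label names
print(adjusted_rand_index(expert, clustering))
```

The index depends only on which items are grouped together, so renaming the clusters does not change the score.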
6.
Stud Health Technol Inform ; 279: 147-148, 2021 May 07.
Article in English | MEDLINE | ID: mdl-33965932

ABSTRACT

BACKGROUND: Delirium is a patient safety issue that often occurs within the population of elderly people. As delirium may be characterized by fluctuating progress, the aim of this work is to find methods to visualize the occurrence of delirium over time in different patient stays in gerontopsychiatric settings. METHODS: We analyzed current data mining visualization techniques for clinical research using a delirium data set collected in a gerontopsychiatric setting. RESULTS: We identified heat maps and dendrograms resulting from hierarchical clustering as a suitable visualization method. CONCLUSION: Heat maps with hierarchical clustering are a suitable data mining tool or visualization technique to study delirium cases in the time course of patient stays.


Subjects
Data Mining; Delirium; Aged; Cluster Analysis; Humans
7.
Sci Rep ; 11(1): 2732, 2021 02 01.
Article in English | MEDLINE | ID: mdl-33526839

ABSTRACT

CaV1.4 L-type calcium channels are predominantly expressed in photoreceptor terminals, where they play a crucial role in synaptic transmission and, consequently, in vision. Human mutations in the encoding gene are associated with congenital stationary night blindness type 2. Besides rod-driven scotopic vision, cone-driven photopic responses are also severely affected in patients. The present study therefore examined functional and morphological changes in cones and cone-related pathways in mice carrying the CaV1.4 gain-of-function mutation I756T (CaV1.4-IT) using multielectrode array, patch-clamp and immunohistochemical analyses. CaV1.4-IT ganglion cell responses to photopic stimuli were seen only in a small fraction of cells, indicative of a major impairment in the cone pathway. Though cone photoreceptors underwent morphological rearrangements, they retained their ability to release glutamate. Our functional data suggested a postsynaptic cone bipolar cell defect, supported by the fact that the majority of cone bipolar cells showed sprouting, while horizontal cells maintained contacts with cones and cone-to-horizontal cell input was preserved. Furthermore, a reduction of basal Ca2+ influx by a calcium channel blocker was not sufficient to rescue synaptic transmission deficits caused by the CaV1.4-IT mutation. Long-term treatments with low-dose Ca2+ channel blockers might, however, be beneficial by reducing Ca2+ toxicity without major effects on ganglion cell responses.


Subjects
Calcium Channels, L-Type/metabolism; Retinal Cone Photoreceptor Cells/metabolism; Visual Pathways/physiology; Animals; Calcium Channels, L-Type/genetics; Cell Shape/physiology; Mice; Mice, Transgenic; Retina/cytology; Retina/metabolism; Retinal Cone Photoreceptor Cells/cytology; Synapses/metabolism; Synaptic Transmission/physiology
8.
Stud Health Technol Inform ; 271: 67-68, 2020 Jun 23.
Article in English | MEDLINE | ID: mdl-32578543

ABSTRACT

BACKGROUND: The Community of Inquiry (CoI) describes success factors for online-based learning. OBJECTIVES: To develop approaches for automatic analysis of CoI to be visualized within student and teacher dashboards. METHODS: Extending indicators from social network analysis and linguistics; evaluation within a case study. RESULTS: The project is just starting. CONCLUSION: Results will help to better understand and improve cooperative online-based learning in higher education.


Subjects
Education, Distance; Learning; Humans; Students; Teaching
9.
Stud Health Technol Inform ; 271: 121-128, 2020 Jun 23.
Article in English | MEDLINE | ID: mdl-32578554

ABSTRACT

Delirium is an acute mental disturbance that particularly occurs during hospital stays. Current clinical assessment instruments include the Delirium Observation Screening Scale (DOSS) and the Confusion Assessment Method (CAM). The aim of this work is to analyze the performance of machine learning approaches to detect delirium based on DOSS and CAM information obtained from two geropsychiatric wards in Tyrol. From a machine learning perspective, the questions of these two assessment instruments represent the features, and the ICD-10 diagnosis of delirium (yes/no) is the corresponding class variable. We compare seven popular classification methods and analyze the performance and interpretability of the learning models. As our dataset is highly imbalanced, we also evaluate the effect of common sampling methods including down- and up-sampling methods as well as hybrid methods. Our results indicate a high predictive ability of advanced methods such as Random Forest that can handle even unbalanced datasets. Overall, combining a good performance of a prediction model with the ability of users to understand the prediction is challenging. However, for clinical application in fully electronic settings, a good performance seems to be more important than an easy interpretation of the prediction by the user. On the other hand, explanations of decisions are often needed to assess other criteria such as safety.


Subjects
Machine Learning; Delirium; Humans; International Classification of Diseases
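
The abstract above mentions down- and up-sampling as remedies for class imbalance. The following is a minimal sketch of plain random resampling, one simple form of each; the function name, parameters, and toy data are hypothetical, and the paper's actual sampling and hybrid methods are not specified in the abstract:

```python
import random


def random_resample(rows, labels, mode="up", seed=0):
    """Balance a labelled dataset by random up- or down-sampling.

    mode="up"   grows every class to the size of the largest class.
    mode="down" shrinks every class to the size of the smallest class.
    """
    if mode not in ("up", "down"):
        raise ValueError("mode must be 'up' or 'down'")
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    sizes = [len(members) for members in by_class.values()]
    target = max(sizes) if mode == "up" else min(sizes)
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        if len(members) >= target:
            # Down-sample (or keep) by drawing without replacement.
            chosen = rng.sample(members, target)
        else:
            # Up-sample: keep all originals, draw the rest with replacement.
            chosen = members + rng.choices(members, k=target - len(members))
        out_rows.extend(chosen)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy data: 2 positive (delirium = 1) and 8 negative (delirium = 0) records.
rows = list(range(10))
labels = [1, 1] + [0] * 8
balanced_rows, balanced_labels = random_resample(rows, labels, mode="up")
print(len(balanced_rows), balanced_labels.count(1), balanced_labels.count(0))
```

Resampling should be applied to the training portion only, so that evaluation data keep the original class distribution.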
10.
Stud Health Technol Inform ; 262: 87-90, 2019 Jul 04.
Article in English | MEDLINE | ID: mdl-31349272

ABSTRACT

Socio-constructive instructional designs for online-based learning focus on interaction and communication of students to allow in-depth learning. The objective of this study is to analyze whether increased interaction of students in online-based learning settings may contribute to better outcomes. We developed indicators for presence, participation, and interactivity of students. We extracted log data from the learning management system for 31 students in 10 online courses (n=123 course attendances). We correlated indicators to final grades and also applied a decision-tree-based machine learning approach. We found only weak to moderate correlations between the indicators and final grades, but acceptable results concerning prediction of students' success based on the indicators. Our results support the theory that student presence and participation in online-based courses are related to learning outcomes.


Subjects
Education, Distance; Humans; Students
11.
Stud Health Technol Inform ; 260: 89-96, 2019.
Article in English | MEDLINE | ID: mdl-31118323

ABSTRACT

BACKGROUND: Machine learning is one important application in the area of health informatics; however, classification methods for longitudinal data are still rare. OBJECTIVES: The aim of this work is to analyze and classify differences in metabolite time series data between groups of individuals regarding their athletic activity. METHODS: We propose a new ensemble-based 2-tier approach to classify metabolite time series data. The first tier uses polynomial fitting to generate a class prediction for each metabolite. An induced classifier (k-nearest-neighbor or naïve Bayes) combines the results to produce a final prediction. Metabolite levels of 47 individuals undergoing a cycle ergometry test were measured using mass spectrometry. RESULTS: In accordance with our previous work, the statistical results indicate strong changes over time. We found only small but systematic differences between the groups. However, our proposed stacking approach obtained a mean accuracy of 78% using 10-fold cross-validation. CONCLUSION: Our proposed classification approach allows a considerable classification performance for time series data with small differences between the groups.


Subjects
Machine Learning; Medical Informatics; Metabolomics; Algorithms; Bayes Theorem; Humans
12.
PLoS Comput Biol ; 11(8): e1004454, 2015 Aug.
Article in English | MEDLINE | ID: mdl-26317529

ABSTRACT

The objectives of this work were the classification of dynamic metabolic biomarker candidates and the modeling and characterization of kinetic regulatory mechanisms in human metabolism in response to external perturbations by physical activity. Longitudinal metabolic concentration data of 47 individuals from 4 different groups were examined, obtained from a cycle ergometry cohort study. In total, 110 metabolites (within the classes of acylcarnitines, amino acids, and sugars) were measured through a targeted metabolomics approach, combining tandem mass spectrometry (MS/MS) with the concept of stable isotope dilution (SID) for metabolite quantitation. Biomarker candidates were selected by combined analysis of maximum fold changes (MFCs) in concentrations and P-values resulting from statistical hypothesis testing. Characteristic kinetic signatures were identified through a mathematical modeling approach utilizing polynomial fitting. Modeled kinetic signatures were analyzed for groups with similar behavior by applying hierarchical cluster analysis. Kinetic shape templates were characterized, defining different forms of basic kinetic response patterns, such as sustained, early, late, and other forms, that can be used for metabolite classification. Acetylcarnitine (C2), showing a late response pattern and having the highest values in MFC and statistical significance, was classified as a late marker and ranked as a strong predictor (MFC = 1.97, P < 0.001). In the class of amino acids, the highest values were shown for alanine (MFC = 1.42, P < 0.001), classified as a late marker and strong predictor. Glucose yields a delayed response pattern, similar to a hockey stick function, being classified as a delayed marker and ranked as a moderate predictor (MFC = 1.32, P < 0.001). These findings coincide with existing knowledge on central metabolic pathways affected in exercise physiology, such as β-oxidation of fatty acids, glycolysis, and glycogenolysis.
The presented modeling approach demonstrates high potential for dynamic biomarker identification and the investigation of kinetic mechanisms in disease or pharmacodynamics studies using MS data from longitudinal cohort studies.


Subjects
Biomarkers/metabolism; Metabolic Networks and Pathways/physiology; Metabolome/physiology; Metabolomics/methods; Motor Activity/physiology; Adult; Algorithms; Female; Glucose/metabolism; Humans; Male; Middle Aged; Models, Biological; Tandem Mass Spectrometry; Young Adult
13.
J Proteomics ; 91: 500-14, 2013 Oct 08.
Article in English | MEDLINE | ID: mdl-23954705

ABSTRACT

New biomarkers are needed to improve the specificity of prostate cancer detection and characterisation of individual tumors. In a proteomics profiling approach using MALDI-MS tissue imaging on frozen tissue sections, we identified discriminating masses. Imaging analysis of cancer, non-malignant benign epithelium and stromal areas of 15 prostatectomy specimens in a test and 10 in a validation set identified characteristic m/z peaks for each tissue type, e.g. m/z 10775 for benign epithelial, m/z 6284 and m/z 6657.5 for cancer and m/z 4965 for stromal tissue. A 10-fold cross-validation analysis showed highest discriminatory ability to separate tissue types for m/z 6284 and m/z 6657.5, both overexpressed in cancer, and a multicomponent mass peak cluster at m/z 10775-10797.4 overexpressed in benign epithelial tissue. ROC AUC values for these three masses ranged from 0.85 to 0.95 in the discrimination of malignant and non-malignant tissue. To identify the underlying proteins, prostate whole tissue extract was separated by nano-HPLC and subjected to MALDI TOF/TOF analysis. Proteins in fractions containing discriminatory m/z masses were identified by MS/MS analysis and candidate marker proteins subsequently validated by immunohistochemistry (IHC). Biliverdin reductase B (BLVRB) turned out to be overexpressed in PCa tissue. BIOLOGICAL SIGNIFICANCE: In this study on cryosections of radical prostatectomies of prostate cancer patients, we performed a MALDI-MS tissue imaging analysis and a consecutive protein identification of significant m/z masses by nano-HPLC, MALDI TOF/TOF and MS/MS analysis. We identified BLVRB as a potential biomarker in the discrimination of PCa and benign tissue, also suggesting BVR as a feasible therapeutic target.


Subjects
Gene Expression Regulation, Enzymologic; Gene Expression Regulation, Neoplastic; Oxidoreductases Acting on CH-CH Group Donors/metabolism; Prostatic Neoplasms/metabolism; Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization; Aged; Area Under Curve; Biomarkers, Tumor; Gene Expression Profiling; Heme/chemistry; Humans; Male; Middle Aged; Prostate/metabolism; Prostatectomy; Sensitivity and Specificity
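
The abstract above reports ROC AUC values for single m/z peaks. As a minimal sketch of what such a value measures, the AUC can be computed as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name and the intensity values below are hypothetical and are not data from the study:

```python
def roc_auc(scores_pos, scores_neg):
    """ROC AUC via pairwise comparison of positive and negative scores."""
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one score in each group")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical peak intensities for one m/z value in two tissue types.
cancer = [8.1, 7.4, 9.0, 6.2]
benign = [5.0, 6.5, 4.8, 5.9]
print(roc_auc(cancer, benign))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups. The pairwise loop is quadratic in the number of samples, which is acceptable for small sets like the one in this sketch.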
14.
J Natl Cancer Inst ; 105(15): 1142-50, 2013 Aug 07.
Article in English | MEDLINE | ID: mdl-23781004

ABSTRACT

BACKGROUND: Despite the excellent prognosis of Fédération Internationale de Gynécologie et d'Obstétrique (FIGO) stage I, type I endometrial cancers, a substantial number of patients experience recurrence and die from this disease. We analyzed the value of immunohistochemical L1CAM determination to predict clinical outcome. METHODS: We conducted a retrospective multicenter cohort study to determine expression of L1CAM by immunohistochemistry in 1021 endometrial cancer specimens. The Kaplan-Meier method and Cox proportional hazard model were applied for survival and multivariable analyses. A machine-learning approach was used to validate variables for predicting recurrence and death. RESULTS: Of 1021 included cancers, 17.7% were rated L1CAM-positive. Of these L1CAM-positive cancers, 51.4% recurred during follow-up compared with 2.9% of L1CAM-negative cancers. Patients bearing L1CAM-positive cancers had poorer disease-free and overall survival (two-sided log-rank P < .001). Multivariable analyses revealed an increase in the likelihood of recurrence (hazard ratio [HR] = 16.33; 95% confidence interval [CI] = 10.55 to 25.28) and death (HR = 15.01; 95% CI = 9.28 to 24.26). In the L1CAM-negative cancers, FIGO stage I subdivision, grading and risk assessment were irrelevant for predicting disease-free and overall survival. The prognostic relevance of these parameters was related strictly to L1CAM positivity. A classification and regression decision tree (CRT) identified L1CAM as the best variable for predicting recurrence (sensitivity = 0.74; specificity = 0.91) and death (sensitivity = 0.77; specificity = 0.89). CONCLUSIONS: To our knowledge, L1CAM has been shown to be the best-ever published prognostic factor in FIGO stage I, type I endometrial cancers and shows clear superiority over the standardly used multifactor risk score. L1CAM expression in type I cancers indicates the need for adjuvant treatment.
This adhesion molecule might serve as a treatment target for the fully humanized anti-L1CAM antibody currently under development for clinical use.


Subjects
Biomarkers, Tumor/analysis; Endometrial Neoplasms/chemistry; Endometrial Neoplasms/diagnosis; Neoplasm Recurrence, Local/chemistry; Neoplasm Recurrence, Local/diagnosis; Neural Cell Adhesion Molecule L1/analysis; Adult; Aged; Brachytherapy; Disease-Free Survival; Endometrial Neoplasms/mortality; Endometrial Neoplasms/pathology; Endometrial Neoplasms/therapy; Female; Humans; Hysterectomy; Immunohistochemistry; Kaplan-Meier Estimate; Lymph Node Excision; Middle Aged; Multivariate Analysis; Neoplasm Recurrence, Local/mortality; Neoplasm Recurrence, Local/prevention & control; Neoplasm Staging; Ovariectomy; Predictive Value of Tests; Proportional Hazards Models; Radiotherapy, Adjuvant; Retrospective Studies; Risk Assessment; Salpingectomy; Sensitivity and Specificity
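
The abstract above summarizes the decision tree's predictions as sensitivity and specificity. A minimal sketch of how these two figures follow from binary predictions and observed outcomes is given below; the function name and the toy vectors are assumptions for illustration and do not reproduce the study's data:

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for paired boolean sequences."""
    tp = fn = tn = fp = 0
    for pred, truth in zip(predicted, actual):
        if truth and pred:
            tp += 1          # event occurred and was predicted
        elif truth and not pred:
            fn += 1          # event occurred but was missed
        elif not truth and pred:
            fp += 1          # event predicted but did not occur
        else:
            tn += 1          # no event, none predicted
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example: marker positivity used as the prediction of recurrence.
marker_positive = [True, True, False, False, False]
recurred = [True, False, True, False, False]
print(sensitivity_specificity(marker_positive, recurred))
```

Sensitivity is the share of actual events that were predicted, and specificity is the share of non-events that were correctly left unflagged.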
15.
Cancer Treat Rev ; 39(1): 77-88, 2013 Feb.
Article in English | MEDLINE | ID: mdl-22789435

ABSTRACT

OBJECTIVES: Cigarette smoking is the most demonstrated risk factor for the development of lung cancer, while the related genetic mechanisms are still unclear. METHODS: The preprocessed microarray expression dataset was downloaded from Gene Expression Omnibus database. Samples were classified according to the disease state, stage and smoking state. A new computational strategy was applied for the identification and biological interpretation of new candidate genes in lung cancer and smoking by coupling a network-based approach with gene set enrichment analysis. MEASUREMENTS: Network analysis was performed by pair-wise comparison according to the disease states (tumor or normal), smoking states (current smokers or nonsmokers or former smokers), or the disease stage (stages I-IV). The most activated metabolic pathways were identified by gene set enrichment analysis. RESULTS: Panels of top ranked gene candidates in smoking or cancer development were identified, including genes involved in cell proliferation and drug metabolism like cytochrome P450 and WW domain containing transcription regulator 1. Semaphorin 5A and protein phosphatase 1F are the common genes represented as major hubs in both the smoking and cancer related network. Six pathways, e.g. cell cycle, DNA replication, RNA transport, protein processing in endoplasmic reticulum, vascular smooth muscle contraction and endocytosis were commonly involved in smoking and lung cancer when comparing the top ten selected pathways. CONCLUSION: New approach of bioinformatics for biomarker identification and validation can probe into deep genetic relationships between cigarette smoking and lung cancer. Our studies indicate that disease-specific network biomarkers, interaction between genes/proteins, or cross-talking of pathways provide more specific values for the development of precision therapies for lung.


Subjects
Gene Regulatory Networks; Lung Neoplasms/genetics; Smoking/genetics; Biomarkers, Tumor/genetics; Databases, Genetic; Female; Humans; Lung Neoplasms/etiology; Lung Neoplasms/pathology; Male; Neoplasm Staging; Oligonucleotide Array Sequence Analysis/methods; Smoking/adverse effects
16.
J Theor Biol ; 310: 216-22, 2012 Oct 07.
Article in English | MEDLINE | ID: mdl-22771628

ABSTRACT

The identification and interpretation of metabolic biomarkers is a challenging task. In this context, network-based approaches have increasingly become a key technology in systems biology, making it possible to capture complex interactions in biological systems. In this work, we introduce a novel network-based method to identify highly predictive biomarker candidates for disease. First, we infer two different types of networks: (i) correlation networks, and (ii) a new type of network called ratio networks. Based on these networks, we introduce scores to prioritize features using topological descriptors of the vertices. To evaluate our method we use an example dataset where quantitative targeted MS/MS analysis was applied to a total of 52 blood samples from 22 persons with obesity (BMI >30) and 30 healthy controls. Using our network-based feature selection approach we identified highly discriminating metabolites for obesity (F-score >0.85, accuracy >85%), some of which could be verified by the literature.


Subjects
Algorithms; Metabolic Networks and Pathways; Metabolomics/methods; Obesity/metabolism; Adult; Case-Control Studies; Humans; Middle Aged; Models, Biological
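
The abstract above infers correlation networks and ranks features by topological descriptors of the vertices. The sketch below shows one plausible reading of the first network type only: metabolites are connected when the absolute Pearson correlation of their concentration profiles reaches a threshold, and vertex degree serves as a simple descriptor. The threshold, the metabolite names, and the use of degree are assumptions; the abstract does not define the paper's scores or its ratio networks.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def correlation_network(profiles, threshold=0.8):
    """Edges between features whose |Pearson r| is at least `threshold`."""
    names = sorted(profiles)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(profiles[a], profiles[b])) >= threshold:
                edges.append((a, b))
    return edges


def vertex_degrees(edges):
    """Number of edges per vertex, a basic topological descriptor."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


# Hypothetical concentration profiles across four samples.
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [1.0, -1.0, 1.0, -1.0],
}
edges = correlation_network(profiles)
print(edges, dict(vertex_degrees(edges)))
```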
17.
ScientificWorldJournal ; 2012: 278352, 2012.
Article in English | MEDLINE | ID: mdl-22654582

ABSTRACT

A pooling design can be used as a powerful strategy to compensate for limited amounts of samples or high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of the widely applied classifiers, support vector machines (SVMs), random forest (RF), k-nearest neighbors (k-NN), penalized logistic regression (PLR), and prediction analysis for microarrays (PAMs). We evaluate a variety of experimental designs using mock omics datasets with varying levels of pool sizes and considering effects from feature selection. Our results show that feature selection significantly improves classifier performance for non-pooled and pooled data. All investigated classifiers yield lower misclassification rates with smaller pool sizes. RF mainly outperforms other investigated algorithms, while accuracy levels are comparable among all the remaining ones. Guidelines are derived to identify an optimal pooling scheme for obtaining adequate predictive power and, hence, to motivate a study design that meets best experimental objectives and budgetary conditions, including time constraints.


Subjects
Algorithms; Computational Biology/methods
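
The abstract above studies virtual pooling with varying pool sizes. One simple way to picture it, assuming that a virtual pool is the feature-wise arithmetic mean of randomly grouped samples from the same class (an assumption, since the abstract does not give the pooling model), is sketched here with hypothetical names and data:

```python
import random


def virtual_pool(samples, pool_size, seed=0):
    """Average random groups of `pool_size` samples feature by feature.

    `samples` is a list of equally long feature vectors from one class.
    Samples left over after forming complete pools are dropped.
    """
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    pools = []
    for start in range(0, len(shuffled) - pool_size + 1, pool_size):
        group = shuffled[start:start + pool_size]
        # zip(*group) iterates over features; each pool is the column mean.
        pools.append([sum(column) / pool_size for column in zip(*group)])
    return pools


# Five mock samples with two features each.
samples = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0], [9.0, 10.0]]
print(virtual_pool(samples, pool_size=2))
```

Larger pools reduce the number of observations available to a classifier while averaging out individual variation, which is the trade-off the study design guidelines address.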
18.
J Clin Bioinforma ; 1(1): 34, 2011 Dec 19.
Article in English | MEDLINE | ID: mdl-22182709

ABSTRACT

BACKGROUND: In metabolomics, biomarker discovery is a highly data-driven process and requires sophisticated computational methods for the search and prioritization of novel and unforeseen biomarkers in data, typically gathered in preclinical or clinical studies. In particular, the discovery of biomarker candidates from longitudinal cohort studies is crucial for kinetic analysis to better understand complex metabolic processes in the organism during physical activity. FINDINGS: In this work we introduce a novel computational strategy that allows the identification and study of kinetic changes of putative biomarkers using targeted MS/MS profiling data from time series cohort studies or other cross-over designs. We propose a prioritization model with the objective of classifying biomarker candidates according to their discriminatory ability and couple this discovery step with a novel network-based approach to visualize, review and interpret key metabolites and their dynamic interactions within the network. The application of our method to longitudinal stress test data revealed a panel of metabolic signatures, i.e., lactate, alanine, glycine and the short-chain fatty acids C2 and C3 in trained and physically fit persons during bicycle exercise. CONCLUSIONS: We propose a new computational method for the discovery of new signatures in dynamic metabolic profiling data, which revealed known and unexpected candidate biomarkers in physical activity. Many of them could be verified and confirmed by the literature. Our computational approach is freely available as an R package termed BiomarkeR under LGPL via CRAN http://cran.r-project.org/web/packages/BiomarkeR/.

19.
Biol Direct ; 6: 53, 2011 Oct 13.
Article in English | MEDLINE | ID: mdl-21995640

ABSTRACT

BACKGROUND: Identifying group-specific characteristics in metabolic networks can provide better insight into evolutionary developments. Here, we present an approach to classify the three domains of life using topological information about the underlying metabolic networks. These networks have been shown to share domain-independent structural similarities, which pose a special challenge for our endeavour. We quantify specific structural information by using topological network descriptors to classify this set of metabolic networks. Such measures quantify the structural complexity of the underlying networks. In this study, we use such measures to capture domain-specific structural features of the metabolic networks to classify the data set. So far, it has been a challenging undertaking to examine what kind of structural complexity such measures do detect. In this paper, we apply two groups of topological network descriptors to metabolic networks and evaluate their classification performance. Moreover, we combine the two groups to perform a feature selection to estimate the structural features with the highest classification ability in order to optimize the classification performance. RESULTS: By combining the two groups, we can identify seven topological network descriptors that show a group-specific characteristic by ANOVA. A multivariate analysis using feature selection and supervised machine learning leads to a reasonable classification performance with a weighted F-score of 83.7% and an accuracy of 83.9%. We further demonstrate that our approach outperforms alternative methods. Also, our results reveal that entropy-based descriptors show the highest classification ability for this set of networks. CONCLUSIONS: Our results show that these particular topological network descriptors are able to capture domain-specific structural characteristics for classifying metabolic networks between the three domains of life.


Subjects
Archaea/classification; Bacteria/classification; Eukaryota/classification; Metabolic Networks and Pathways; Algorithms; Analysis of Variance; Archaea/metabolism; Artificial Intelligence; Bacteria/metabolism; Computer Graphics; Eukaryota/metabolism; Logistic Models; Reproducibility of Results; Software
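
The abstract above finds that entropy-based topological descriptors classify the networks best, without naming them. As a generic illustration of the idea only, the sketch below computes the Shannon entropy of a graph's degree distribution; this particular descriptor and the toy graphs are assumptions and may differ from the measures used in the study:

```python
from collections import Counter
from math import log2


def degree_distribution_entropy(edges):
    """Shannon entropy (in bits) of the degree distribution of a graph.

    `edges` is a list of undirected (u, v) pairs. Vertices that appear in
    no edge are not counted.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    # How many vertices share each degree value.
    degree_counts = Counter(degree.values())
    return sum((c / n) * log2(n / c) for c in degree_counts.values())


# A star (one hub, three leaves) versus a 4-cycle (all degrees equal).
star = [(0, 1), (0, 2), (0, 3)]
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(degree_distribution_entropy(star), degree_distribution_entropy(cycle))
```

A regular graph, where every vertex has the same degree, has entropy 0, while more heterogeneous degree patterns give larger values. A single number of this kind can then serve as one feature in a supervised classifier.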
20.
J Clin Bioinforma ; 1(1): 2, 2011 Jan 20.
Article in English | MEDLINE | ID: mdl-21884622

ABSTRACT

The search and validation of novel disease biomarkers requires the complementary power of professional study planning and execution, modern profiling technologies and related bioinformatics tools for data analysis and interpretation. Biomarkers have considerable impact on the care of patients and are urgently needed for advancing diagnostics, prognostics and treatment of disease. This survey article highlights emerging bioinformatics methods for biomarker discovery in clinical metabolomics, focusing on the problem of data preprocessing and consolidation, the data-driven search, verification, prioritization and biological interpretation of putative metabolic candidate biomarkers in disease. In particular, data mining tools suitable for application to omic data gathered from the most frequently used types of experimental designs, such as case-control or longitudinal biomarker cohort studies, are reviewed, and case examples of selected discovery steps are delineated in more detail. This review demonstrates that clinical bioinformatics has evolved into an essential element of biomarker discovery, translating new innovations and successes in profiling technologies and bioinformatics to clinical application.
